Semantic Similarity Calculation of Chinese Word

Authors

Liqiang Pan

Pu Zhang

Anping Xiong

Abstract

This paper proposes a two-layer method for calculating the semantic similarity of Chinese words. First, a Latent Dirichlet Allocation (LDA) topic model is used to generate a topic space. Words are then mapped into this topic space, and the resulting topic distributions are used to calculate the semantic similarity of words (the first layer). Finally, the semantic dictionary "HowNet" is used to further mine the semantic similarity of words (the second layer). The method overcomes the lack of specificity that arises when LDA alone is used to calculate word similarity. It also addresses two problems of similarity calculation based on "HowNet": new words that have not yet been added to the dictionary, and the failure to consider the specific context. Experimental comparison demonstrates the feasibility, availability, and advantages of the method.

Keywords: semantic similarity; LDA; topic model; HowNet
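The two layers described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the topic distributions are compared or how the two layers are combined, so the cosine measure, the weighted average, and the `alpha` parameter are all assumptions. The toy topic model and the `toy_dictionary_sim` function are hypothetical stand-ins for a trained LDA model and a HowNet-based similarity measure.

```python
import math

def topic_distribution(word, topic_word_probs, topic_priors):
    """Return p(topic | word) via Bayes' rule, or None if the word is unseen."""
    joint = [prior * probs.get(word, 0.0)
             for prior, probs in zip(topic_priors, topic_word_probs)]
    total = sum(joint)
    if total == 0.0:
        return None  # the word never occurs in the corpus
    return [j / total for j in joint]

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def two_layer_similarity(w1, w2, topic_word_probs, topic_priors,
                         dictionary_sim, alpha=0.5):
    """Combine a topic-space score (layer 1) with a dictionary score (layer 2).

    The weighted average is an assumption made for illustration only.
    """
    d1 = topic_distribution(w1, topic_word_probs, topic_priors)
    d2 = topic_distribution(w2, topic_word_probs, topic_priors)
    topic_sim = cosine(d1, d2) if d1 is not None and d2 is not None else None
    dict_sim = dictionary_sim(w1, w2)  # None when a word is not in the dictionary
    if topic_sim is None and dict_sim is None:
        return 0.0
    if dict_sim is None:
        return topic_sim  # new word: fall back to the topic layer
    if topic_sim is None:
        return dict_sim
    return alpha * topic_sim + (1 - alpha) * dict_sim

# Toy p(word | topic) tables standing in for a trained LDA model (made-up values).
TOPIC_WORD_PROBS = [
    {"苹果": 0.4, "香蕉": 0.3, "榴莲": 0.2, "电脑": 0.1},  # a "fruit" topic
    {"苹果": 0.2, "电脑": 0.8},                            # a "technology" topic
]
TOPIC_PRIORS = [0.5, 0.5]

# Hypothetical dictionary scores; "榴莲" plays the role of a new word
# that is missing from the dictionary.
_TOY_SCORES = {
    frozenset(("苹果", "香蕉")): 0.8,
    frozenset(("苹果", "电脑")): 0.3,
}

def toy_dictionary_sim(w1, w2):
    """Stand-in for a HowNet-based measure; returns None for unknown pairs."""
    return _TOY_SCORES.get(frozenset((w1, w2)))

for pair in [("苹果", "香蕉"), ("苹果", "电脑"), ("香蕉", "榴莲")]:
    score = two_layer_similarity(*pair, TOPIC_WORD_PROBS, TOPIC_PRIORS,
                                 toy_dictionary_sim)
    print(pair, round(score, 3))
```

In this sketch, a word that is missing from the dictionary still receives a score from the topic layer, which mirrors the abstract's point about new words. The dictionary layer, in turn, contributes a finer-grained score when both words are known.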

Similar resources

The Research of Chinese Semantic Similarity Calculation Introduced Punctuations

So far, most Chinese natural language processing neglects punctuation or oversimplifies its function. To improve the efficiency of Chinese similarity computing, this paper presents a Chinese similarity computing system model that addresses the problems of Chinese sentence similarity computation. This model is a combination of punctuations and traditional similarity computing. Co...

Study of Chinese Text Similarity Based on Difference Factor in Word-Number

Text similarity calculation is the basic work in the application of Chinese information processing. A high-quality text similarity calculation method must be accurate and efficient; that is, it must be able to compare texts at the level of natural language meaning, and arrive at a similarity distinction similar to that of human reading based on a full understanding of the author or text sou...

Word Semantic Similarity Calculation Based on Domain Knowledge and HowNet

Word semantic similarity is the foundation of semantic processing and a key issue in many applications. This paper argues that word semantic similarity should be associated with domain knowledge, which traditional methods do not take into account. In order to incorporate domain knowledge into semantic similarity measurement, this paper proposes a sensitive word set approach. For this purpose, we a...

The Research of Chinese Words Semantic Similarity Calculation with Multi-Information

Text similarity has a relatively wide range of applications in many fields, such as intelligent information retrieval, question answering systems, text rechecking, and machine translation. Meaning-based text similarity computing has been used more widely in computing the similarity of words and phrases. Using the knowledge structure of the and its method of knowledg...

Keyword Extraction From Chinese Text Based On Multidimensional Weighted Features

This paper proposed to solve the problems of incomplete coverage and low accuracy in keyword extraction of Chinese text based on intrinsic feature of the Chinese language and an extraction method of multidimensional information weighted eigenvalues. This method combined theoretical analysis and experimental calculation to study the parts of speech, word position, word length, semantic similarit...

Publication date: 2014
